Quality assessment of dimensionality reduction: Rank-based criteria

Authors

  • John Aldo Lee
  • Michel Verleysen
Abstract

Dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new nonlinear methods have been proposed in recent years, yet the question of their assessment and comparison remains open. This paper first reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. Next, the definition of the co-ranking matrix provides a tool for comparing the ranks in the initial data set with those in some low-dimensional embedding. Rank errors and concepts such as neighborhood intrusions and extrusions can then be associated with different blocks of the co-ranking matrix. Several quality criteria can be cast within this unifying framework; they are shown to involve one or several of these characteristic blocks. Following this line, simple criteria are proposed, which quantify two aspects of the embedding quality, namely its overall quality and its tendency to favor intrusions or extrusions. They are applied to several recent dimensionality reduction methods in two experiments, with both artificial and real data. © 2009 Elsevier B.V. All rights reserved.
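
The co-ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`rank_matrix`, `coranking_matrix`, `quality_and_behavior`), the use of Euclidean distances, and the exact normalization are assumptions based on a commonly used formulation, in which entry (k, l) counts the pairs whose neighbor rank is k in the original data and l in the embedding, and in which a pair counts as an intrusion when its rank is lower in the embedding than in the original data.

```python
import numpy as np

def rank_matrix(X):
    """Return ranks[i, j]: position of point j in the distance-sorted
    neighbor list of point i (point i itself gets rank 0)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances (assumed metric)
    np.fill_diagonal(D, -1.0)               # make sure each point ranks itself first
    order = np.argsort(D, axis=1, kind="stable")
    n = X.shape[0]
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return ranks

def coranking_matrix(X_high, X_low):
    """Q[k-1, l-1] = number of pairs (i, j) with rank k in the original
    data and rank l in the embedding (ranks run from 1 to n-1)."""
    n = X_high.shape[0]
    rho = rank_matrix(X_high)                # ranks in the original space
    r = rank_matrix(X_low)                   # ranks in the embedding
    off_diag = ~np.eye(n, dtype=bool)        # skip the (i, i) pairs
    Q = np.zeros((n - 1, n - 1), dtype=np.int64)
    np.add.at(Q, (rho[off_diag] - 1, r[off_diag] - 1), 1)
    return Q

def quality_and_behavior(Q, K):
    """Two illustrative criteria computed from the upper-left K-by-K block:
    overall quality (fraction of preserved K-ary neighbors) and the balance
    between intrusions (below the diagonal) and extrusions (above it)."""
    n = Q.shape[0] + 1
    block = Q[:K, :K]
    norm = K * n
    quality = block.sum() / norm
    intrusions = np.tril(block, k=-1).sum() / norm   # original rank > embedding rank
    extrusions = np.triu(block, k=1).sum() / norm    # original rank < embedding rank
    return quality, intrusions - extrusions

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))
    Y = X[:, :2]                             # crude "embedding": keep two coordinates
    Q = coranking_matrix(X, Y)
    print(quality_and_behavior(Q, K=10))
```

Under these assumptions, a perfect embedding yields a diagonal co-ranking matrix, a quality of 1 and a behavior value of 0; a positive behavior value would indicate that intrusions dominate within the K-ary neighborhoods, a negative one that extrusions dominate.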

Similar resources

Rank-based quality assessment of nonlinear dimensionality reduction

Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been proposed in recent years, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. Many quality criteria actually rely on the a...

Quality assessment of nonlinear dimensionality reduction based on K-ary neighborhoods

Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been recently proposed, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. In this context, the comparison of the ranks in the hi...

Scale-independent quality criteria for dimensionality reduction

Dimensionality reduction aims at representing high-dimensional data in low-dimensional spaces, in order to facilitate their visual interpretation. Many techniques exist, ranging from simple linear projections to more complex nonlinear transformations. The large variety of methods emphasizes the need for quality criteria that allow for fair comparisons between them. This paper extends previous wo...

A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters

Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...

How to Evaluate Dimensionality Reduction? - Improving the Co-ranking Matrix

The growing number of dimensionality reduction (DR) methods available for data visualization has recently inspired the development of quality assessment measures, in order to evaluate the resulting low-dimensional representation independently from a method's inherent criteria. Several (existing) quality measures can be (re)formulated based on the so-called co-ranking matrix, which subsumes all ...

Journal title:
  • Neurocomputing

Volume 72  Issue 

Pages  -

Publication date 2009